Results 1 - 8 of 8
1.
J Am Med Inform Assoc; 25(3): 300-308, 2018 Mar 01.
Article in English | MEDLINE | ID: mdl-29346583

ABSTRACT

OBJECTIVE: Finding relevant datasets is important for promoting data reuse in the biomedical domain, but it is challenging given the volume and complexity of biomedical data. Here we describe the development of an open source biomedical data discovery system called DataMed, with the goal of promoting the building of additional data indexes in the biomedical domain.
MATERIALS AND METHODS: DataMed, which can efficiently index and search diverse types of biomedical datasets across repositories, was developed through the National Institutes of Health-funded biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) consortium. It consists of 2 main components: (1) a data ingestion pipeline that collects and transforms original metadata information to a unified metadata model, called DatA Tag Suite (DATS), and (2) a search engine that finds relevant datasets based on user-entered queries. In addition to describing its architecture and techniques, we evaluated individual components within DataMed, including the accuracy of the ingestion pipeline, the prevalence of the DATS model across repositories, and the overall performance of the dataset retrieval engine.
RESULTS AND CONCLUSION: Our manual review shows that the ingestion pipeline achieved an accuracy of 90%, and that core elements of DATS varied in frequency across repositories. On a manually curated benchmark dataset, the DataMed search engine achieved an inferred average precision of 0.2033 and a precision at 10 (P@10, the number of relevant results in the top 10 search results) of 0.6022 by implementing advanced natural language processing and terminology services. We have made the DataMed system publicly available as an open source package for the biomedical community.
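As a rough illustration of the two components described above, the sketch below (not DataMed's actual code; the DATS fields shown are a simplified subset of the real schema, and the input record fields are hypothetical) maps a repository-specific metadata record into a DATS-like description and computes the P@10 metric used in the evaluation.

```python
# Illustrative sketch only: simplified DATS-style mapping and P@10 computation.

def to_dats(record: dict) -> dict:
    """Transform one repository-specific record into a reduced DATS-like dict."""
    return {
        "title": record.get("name", ""),
        "description": record.get("summary", ""),
        "types": [{"information": {"value": t}} for t in record.get("data_types", [])],
        "distributions": [{"access": {"landingPage": record.get("url", "")}}],
    }

def precision_at_10(ranked_ids, relevant_ids) -> float:
    """P@10: fraction of the top 10 retrieved datasets that are relevant."""
    top10 = ranked_ids[:10]
    return sum(1 for d in top10 if d in relevant_ids) / 10

# Hypothetical example
record = {"name": "Expression profiling of tumor samples", "summary": "Array data ...",
          "data_types": ["gene expression"], "url": "https://example.org/dataset-1"}
print(to_dats(record))
print(precision_at_10(["d1", "d2", "d3"], {"d1", "d3"}))  # 0.2
```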

2.
J Am Med Inform Assoc; 24(2): 380-387, 2017 Mar 01.
Article in English | MEDLINE | ID: mdl-27589942

ABSTRACT

Background: Implementation of patient preferences for use of electronic health records for research has traditionally been limited to identifiable data. Tiered e-consent for use of de-identified data has traditionally been deemed unnecessary or impractical for implementation in clinical settings.
Methods: We developed a web-based tiered informed consent tool called informed consent for clinical data and bio-sample use for research (iCONCUR) that honors granular patient preferences for use of electronic health record data in research. We piloted this tool in 4 outpatient clinics of an academic medical center.
Results: Of the patients offered access to iCONCUR, 394 agreed to participate in this study, among whom 126 accessed the website to modify their records according to data category and data recipient. The majority consented to share most of their data and specimens with researchers. Willingness to share was greater among participants from a Human Immunodeficiency Virus (HIV) clinic than among those from internal medicine clinics. The number of items declined was higher for for-profit institution recipients. Overall, participants were most willing to share demographics and body measurements and least willing to share family history and financial data. Participants indicated that having granular choices for data sharing was appropriate, and that they liked being informed about who was using their data for what purposes, as well as about outcomes of the research.
Conclusion: This study suggests that a tiered electronic informed consent system is a workable solution that respects patient preferences, increases satisfaction, and does not significantly affect participation in research.
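A minimal sketch of how granular, tiered preferences of this kind could be represented, assuming a simple (data category, recipient type) lookup; this is illustrative only, not the iCONCUR implementation, and the category and recipient names are hypothetical.

```python
# Illustrative sketch: per-category, per-recipient consent choices with a default.

DEFAULT_ALLOW = True  # in the pilot, most participants shared most items

class ConsentRecord:
    def __init__(self):
        # preferences[(data_category, recipient_type)] -> bool
        self.preferences = {}

    def set_choice(self, category: str, recipient: str, allow: bool) -> None:
        self.preferences[(category, recipient)] = allow

    def may_share(self, category: str, recipient: str) -> bool:
        return self.preferences.get((category, recipient), DEFAULT_ALLOW)

consent = ConsentRecord()
consent.set_choice("financial data", "for-profit institution", False)
consent.set_choice("family history", "for-profit institution", False)

print(consent.may_share("demographics", "academic researcher"))       # True
print(consent.may_share("financial data", "for-profit institution"))  # False
```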


Subjects
Biomedical Research/ethics, Electronic Health Records, Informed Consent, Patient Preference, Feasibility Studies, Female, Humans, Information Dissemination/ethics, Male, Socioeconomic Factors
3.
J Natl Cancer Inst; 109(2), 2017 Feb.
Article in English | MEDLINE | ID: mdl-27688295

ABSTRACT

Biospecimen donation is key to the Precision Medicine Initiative, which pioneers a model for accelerating biomedical research through individualized care. Personalized medicine should be made available to medically underserved populations, including the large and growing US Hispanic population. We present results of a study of 140 Hispanic women who underwent a breast biopsy at a safety-net hospital and were randomly assigned to receive information and a request for consent for biospecimen and data sharing from either their physician or a research assistant. Consent rates were high (97.1% and 92.9% in the physician and research assistant arms, respectively) and did not differ between groups (relative risk [RR] = 1.05, 95% confidence interval [CI] = 0.96 to 1.10). Consistent with a small but growing literature, we show that perceptions of Hispanics' unwillingness to participate in biospecimen sharing for research are not supported by data. Safety-net clinics and hospitals offer untapped possibilities for enhancing participation of underserved populations in the exciting Precision Medicine Initiative.
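For reference, the reported relative risk follows directly from the two arm-specific consent rates (the confidence interval additionally depends on the per-arm sample sizes, which are not restated here):

\[
\mathrm{RR} = \frac{p_{\text{physician}}}{p_{\text{research assistant}}} = \frac{0.971}{0.929} \approx 1.05
\]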


Subjects
Biological Specimen Banks, Breast/pathology, Hispanic or Latino, Information Dissemination, Informed Consent, Adult, Biopsy, Cooperative Behavior, Female, Humans, Middle Aged, Precision Medicine, Random Allocation, Safety-net Providers, Vulnerable Populations
4.
AMIA Annu Symp Proc; 2016: 1880-1889, 2016.
Article in English | MEDLINE | ID: mdl-28269947

ABSTRACT

Natural Language Processing (NLP) is essential for concept extraction from narrative text in electronic health records (EHRs). To extract numerous and diverse concepts, such as data elements (i.e., important concepts related to a certain medical condition), a plausible solution is to combine various NLP tools into an ensemble to improve extraction performance. However, it is unclear to what extent ensembles of popular NLP tools improve the extraction of numerous and diverse concepts. We therefore built an NLP ensemble pipeline that combines the strengths of popular NLP tools using seven ensemble methods, and quantified the improvement in performance achieved by ensembles in the extraction of data elements for three very different cohorts. Evaluation results show that the pipeline can improve the performance of individual NLP tools, but the gains vary considerably across cohorts.
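As a hedged sketch of one common ensemble method for this task (majority voting over span-level annotations; not necessarily one of the seven methods used in the paper, and the tool outputs below are invented), combining several NLP tools might look like this:

```python
# Illustrative sketch: majority-vote ensemble over concept mentions from several tools.
from collections import Counter
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start offset, end offset, concept label)

def majority_vote(tool_outputs: List[Set[Span]], threshold: float = 0.5) -> Set[Span]:
    """Keep a span if more than `threshold` of the tools extracted it."""
    counts = Counter(span for output in tool_outputs for span in output)
    needed = threshold * len(tool_outputs)
    return {span for span, n in counts.items() if n > needed}

# Hypothetical example: three tools annotate the same clinical sentence
tool_a = {(0, 12, "Hypertension"), (20, 29, "Metformin")}
tool_b = {(0, 12, "Hypertension")}
tool_c = {(0, 12, "Hypertension"), (35, 41, "Anemia")}

print(majority_vote([tool_a, tool_b, tool_c]))  # {(0, 12, 'Hypertension')}
```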


Subjects
Electronic Health Records, Information Storage and Retrieval/methods, Natural Language Processing, Data Collection, Humans
5.
J Am Med Inform Assoc; 22(6): 1187-95, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26142423

ABSTRACT

BACKGROUND: Centralized and federated models for sharing data in research networks currently exist. To perform multivariate data analysis in centralized networks, patient-level data must be transferred to a central computation resource. The authors implemented distributed multivariate models for federated networks in which patient-level data are kept at each site and data exchange policies are managed in a study-centric manner.
OBJECTIVE: The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies.
MATERIALS AND METHODS: Based on requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network.
RESULTS: The authors implemented massively parallel (map-reduce) computation methods and a new policy management system that enables each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws.
DISCUSSION AND CONCLUSION: Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks.
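The following is a simplified sketch, not SCANNER's implementation, of the general idea behind distributed iterative multivariate estimation: each site computes an aggregate (here, a logistic-regression gradient) on its own patient-level data, and only those aggregates are combined by a coordinator. The synthetic data, learning rate, and iteration count are illustrative.

```python
# Illustrative sketch: federated logistic regression via shared gradient aggregates.
import numpy as np

def local_gradient(X, y, beta):
    """Per-site gradient of the logistic log-likelihood; only this aggregate is shared."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return X.T @ (y - p)

def federated_fit(sites, n_features, lr=0.5, iters=1000):
    beta = np.zeros(n_features)
    n_total = sum(len(y) for _, y in sites)
    for _ in range(iters):
        # coordinator sums site-level gradients; raw records never leave a site
        grad = sum(local_gradient(X, y, beta) for X, y in sites)
        beta += lr * grad / n_total
    return beta

# Synthetic multi-site data for demonstration
rng = np.random.default_rng(0)
true_beta = np.array([0.5, -1.0, 2.0])

def make_site(n_patients):
    X = rng.normal(size=(n_patients, 3))
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))
    return X, y

sites = [make_site(200), make_site(300), make_site(150)]
print(federated_fit(sites, n_features=3))  # estimates approach true_beta
```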


Subjects
Comparative Effectiveness Research/organization & administration, Computer Communication Networks, Information Dissemination, Statistical Models, Software, Biomedical Research, Databases as Topic, Information Storage and Retrieval, Internet, Multivariate Analysis
6.
Bioinformatics; 30(19): 2826-7, 2014 Oct.
Article in English | MEDLINE | ID: mdl-24907367

ABSTRACT

SUMMARY: MAGI is a web service for fast MicroRNA-Seq data analysis on a graphics processing unit (GPU) infrastructure. Using just a browser, users have access to results as web reports in just a few hours, a >600% end-to-end performance improvement over the state of the art. MAGI's salient features are (i) transfer of large input files in native FASTA with Qualities (FASTQ) format through drag-and-drop operations, (ii) rapid prediction of microRNA target genes leveraging parallel computing with GPU devices, (iii) all-in-one analytics with novel feature extraction, statistical tests for differential expression, and diagnostic plot generation for quality control, and (iv) interactive visualization and exploration of results in web reports that are readily available for publication.
AVAILABILITY AND IMPLEMENTATION: MAGI relies on the Node.js JavaScript framework, along with NVIDIA CUDA C, PHP: Hypertext Preprocessor (PHP), Perl, and R. It is freely available at http://magi.ucsd.edu.
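Purely as an illustration of the kind of per-microRNA differential-expression test such a pipeline automates (MAGI itself performs its analytics on GPUs via CUDA C behind a Node.js/PHP/Perl/R stack; the counts, sample layout, and miRNA names below are invented):

```python
# Illustrative sketch: library-size-normalized counts compared between two groups.
import numpy as np
from scipy import stats

counts = {  # raw read counts per sample (4 control, 4 treated); toy values
    "hsa-miR-21-5p":  [850, 900, 780, 820, 1600, 1700, 1550, 1500],
    "hsa-miR-155-5p": [120, 100, 130, 110, 115, 125, 105, 120],
}
libsize = np.array([1e6] * 8)  # total mapped reads per sample (toy value)

for mirna, c in counts.items():
    cpm = np.log2(np.array(c) / libsize * 1e6 + 1)   # log2 counts-per-million
    ctrl, treat = cpm[:4], cpm[4:]
    t, p = stats.ttest_ind(treat, ctrl)
    print(f"{mirna}: log2FC={treat.mean() - ctrl.mean():+.2f}, p={p:.3g}")
```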


Subjects
Computational Biology/methods, Computer Graphics, MicroRNAs/analysis, RNA Sequence Analysis, Internet, Programming Languages, Software
7.
Article in English | MEDLINE | ID: mdl-24303320

ABSTRACT

The NIH-funded iDASH National Center for Biomedical Computing was created in 2010 with the goal of developing infrastructure, algorithms, and tools to integrate Data for Analysis, 'anonymization,' and SHaring. iDASH is based on the premise that, while a strong case can be made for not sharing information in order to preserve individual privacy, an equally compelling case can be made for sharing genome information for the public good (i.e., to support new discoveries that promote health or alleviate the burden of disease). In fact, these cases need not be mutually exclusive: genome data sharing on a cloud does not necessarily have to compromise individual privacy, although current practices need significant improvement. So far, protection of subject data from re-identification and misuse has relied primarily on regulations such as HIPAA, the Common Rule, and GINA. However, protection of biometrics such as a genome requires specialized infrastructure and tools.

8.
J Am Med Inform Assoc; 19(2): 196-201, 2012.
Article in English | MEDLINE | ID: mdl-22081224

ABSTRACT

iDASH (integrating data for analysis, anonymization, and sharing) is the newest National Center for Biomedical Computing funded by the NIH. It focuses on algorithms and tools for sharing data in a privacy-preserving manner. Foundational privacy technology research performed within iDASH is coupled with innovative engineering for collaborative tool development and data-sharing capabilities in a private Health Insurance Portability and Accountability Act (HIPAA)-certified cloud. Driving Biological Projects, which span different biological levels (from molecules to individuals to populations) and focus on various health conditions, help guide research and development within this Center. Furthermore, training and dissemination efforts connect the Center with its stakeholders and educate data owners and data consumers on how to share and use clinical and biological data. Through these various mechanisms, iDASH implements its goal of providing biomedical and behavioral researchers with access to data, software, and a high-performance computing environment, thus enabling them to generate and test new hypotheses.


Subjects
Algorithms, Confidentiality, Information Dissemination, Medical Informatics, Forecasting, Goals, Health Insurance Portability and Accountability Act, Information Storage and Retrieval, United States